-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-32609: [CI] Add type checking infrastructure and CI workflow for type annotations #48618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
afe4699 to
30017ff
Compare
AlenkaF
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @rok for working on this! Splitting up the main PR sounds like a great idea to me.
I only have one small question, otherwise the changes look good to me.
|
Yes, let's wait for Raul 👍 |
raulcd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @rok sorry it took me a while to review with holidays, release, etcetera
|
@github-actions crossbow submit wheel-manylinux-2-28-cp311-cp311-amd64 |
|
Revision: 28fba3a Submitted crossbow builds: ursacomputing/crossbow @ actions-8a459250fe
|
compose.yaml
Outdated
| ARROW_S3: "OFF" | ||
| ARROW_SUBSTRAIT: "OFF" | ||
| ARROW_WITH_OPENTELEMETRY: "OFF" | ||
| PYARROW_TEST_ANNOTATIONS: "ON" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@raulcd do you think we actually want to be able to toggle these? I'm thinking we don't really need them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is worth it if instead of using a separate ci/scripts/python_test_type_annotations.sh we use ci/scripts/python_test.sh and we use the PYARROW_TEST_ANNOTATIONS variable to manage whether we want annotations to be tested or not as part of python testing.
If we decide we want to have two separate scripts I think this env var should be removed.
I lean towards having a single python test script that can allow us to also test annotations but I can find arguments for both so not an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, agreed with your reasoning. I also introduced PYARROW_TEST_ANNOTATIONS to the windows script.
|
Thanks for review @raulcd ! I've responded to your points. |
compose.yaml
Outdated
| ARROW_S3: "OFF" | ||
| ARROW_SUBSTRAIT: "OFF" | ||
| ARROW_WITH_OPENTELEMETRY: "OFF" | ||
| PYARROW_TEST_ANNOTATIONS: "ON" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is worth it if instead of using a separate ci/scripts/python_test_type_annotations.sh we use ci/scripts/python_test.sh and we use the PYARROW_TEST_ANNOTATIONS variable to manage whether we want annotations to be tested or not as part of python testing.
If we decide we want to have two separate scripts I think this env var should be removed.
I lean towards having a single python test script that can allow us to also test annotations but I can find arguments for both so not an issue.
| # We first populate stub docstrings and then build the wheel | ||
| python setup.py build_ext --inplace | ||
| python -m pip install griffe libcst | ||
| python ../dev/update_stub_docstrings.py pyarrow-stubs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not necessary, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as #48618 (comment). It's a noop and we don't have to introduce python logic in stub PRs if we do here.
| @REM We first populate stub docstrings and then build the wheel | ||
| %PYTHON_CMD% setup.py build_ext --inplace | ||
| %PYTHON_CMD% -m pip install griffe libcst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until we have real annotations this isn't necessary, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added running dev\update_stub_docstrings.py, so I suppose it's a noop now and we can keep it to make sure it doesn't fail before introducing actual stubs? This will keep CI logic out of stub PRs.
change bat lint add a popd and nicer logging for windows ReplaceElipsis -> DocstringInserter simplify remove sphinx
5ce9011 to
cebbfcd
Compare
|
@kou could you take a quick look at this? |
.github/workflows/python.yml
Outdated
| - name: Test annotations | ||
| shell: bash | ||
| env: | ||
| PYARROW_TEST_ANNOTATIONS: "ON" | ||
| run: ci/scripts/python_test_type_annotations.sh $(pwd)/python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't object this approach but it may be better that we use pre-commit for type checking. (I think that type checking is one of lint checks.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a nice idea! So instead of a new workflow step you are proposing we move these checks into pre-commit script, correct? I'll change to pre-commit approach.
One thing I wonder about - are there users that run all these checks locally and would not like the new extra jobs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, we need compiled pyarrow to run type annotation checks against and that is not practical for pre-commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's too bad. I reverted back to workflow approach, but only put the annotation job into a single workflow now.
| """Type stubs for PyArrow. | ||
| This is a placeholder stub file. | ||
| Complete type annotations will be added in subsequent PRs. | ||
| """ | ||
|
|
||
| from typing import Any | ||
|
|
||
| def __getattr__(name: str) -> Any: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this isn't intended to be permanent.
But I thought you might want to consider adding a note/opening an issue to remove it before release.
Here's some examples of how it can mask runtime errors:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, interesting I didn't consider this one! Yes this is meant as a temporary and we would want it out before the release. I'll make a note and issue, thanks for pointing it out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Docs failure is from the Pandas issue. |
Rationale for this change
This is the first in series of PRs adding type annotations to pyarrow and resolving #32609.
What changes are included in this PR?
This PR establishes infrastructure for type checking:
py.typedmarker to enable type checkingAre these changes tested?
No. This is mostly a CI change.
Are there any user-facing changes?
This does not add any actual annotations (only
py.typedmarker) so user should not be affected.